Fix: ModelBuilder deployment & optimization of JumpStart llama-3.1 models #4937

cj-zhang · 2024-11-21T00:20:55Z

Description of changes:

Allow deploying JumpStart models with gated draft models through ModelBuilder
Fix a regression in Fast Model Loading on GoldFinch
Fix compilation and quantization of Llama 3.1 models

Testing done: E2E notebook testing, UTs and integ tests.

Merge Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General

I have read the CONTRIBUTING doc
I certify that the changes I am introducing will be backward compatible, and I have discussed concerns about this, if any, with the Python SDK team
I used the commit message format described in CONTRIBUTING
I have passed the region in to all S3 and STS clients that I've initialized as part of this change.
I have updated any necessary documentation, including READMEs and API docs (if appropriate)

Tests

I have added tests that prove my fix is effective or that my feature works (if appropriate)
I have added unit and/or integration tests as appropriate to ensure backward compatibility of the changes
I have checked that my tests are not configured for a specific region or account (if appropriate)
I have used unique_name_from_base to create resource names in integ tests (if appropriate)
If adding any dependency in requirements.txt files, I have spell checked and ensured they exist in PyPi

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

src/sagemaker/model.py

src/sagemaker/serve/builder/model_builder.py

Lokiiiiii · 2024-11-22T05:01:35Z

src/sagemaker/jumpstart/utils.py

-            mutable_model_data_source.pop(
-                "HostingEulaKey"
-            )  # pop when model access config is not applicable
+            if "HostingEulaKey" in mutable_model_data_source:


Can we also add a test for this in test_model_builder, in addition to what you already have? This is a very critical path in ModelBuilder.

tests/integ/sagemaker/serve/test_serve_js_deep_unit_tests.py under test_js_model_with_optimize_speculative_decoding_config_gated_requests_are_expected
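
The diff in this thread swaps an unconditional `pop` for a membership check. A minimal sketch of why that matters, using plain dictionaries only: `dict.pop(key)` raises `KeyError` when the key is absent, while the guarded form is a no-op. The helper name `strip_eula_key` and the sample data sources are hypothetical and are not taken from the SDK; only the key name `"HostingEulaKey"` comes from the diff.

```python
from copy import deepcopy

# Key name taken from the diff; everything else here is illustrative.
EULA_KEY = "HostingEulaKey"


def strip_eula_key(model_data_source: dict) -> dict:
    """Return a copy of the data source without the EULA key, if present."""
    mutable_model_data_source = deepcopy(model_data_source)
    # Guarded removal, mirroring the `if ... in ...` check in the diff.
    # An unguarded `.pop(EULA_KEY)` would raise KeyError when the key is missing.
    if EULA_KEY in mutable_model_data_source:
        mutable_model_data_source.pop(EULA_KEY)
    return mutable_model_data_source


# Hypothetical inputs: one data source carrying the key, one without it.
gated = {
    "S3DataSource": {"S3Uri": "s3://example-bucket/model/"},
    "HostingEulaKey": "eula.txt",
}
ungated = {"S3DataSource": {"S3Uri": "s3://example-bucket/model/"}}

print(strip_eula_key(gated))
print(strip_eula_key(ungated))
```

An equivalent one-liner is `mutable_model_data_source.pop(EULA_KEY, None)`, which also tolerates a missing key. A regression test along the lines the reviewer asks for would cover both inputs: a data source with the key and one without.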

gwang111 · 2024-11-22T09:47:14Z

flake8 installed: flake8==4.0.1,flake8-future-import==0.4.6,mccabe==0.6.1,pycodestyle==2.8.0,pyflakes==2.4.0
flake8 run-test-pre: PYTHONHASHSEED='42'
flake8 run-test: commands[0] | flake8
_____________________________________________________________________________________________________________________ summary _____________________________________________________________________________________________________________________
  flake8: commands succeeded
  congratulations :)

Flake8 is passing on my end with the same versions.

mufaddal-rohawala · 2024-11-22T12:06:09Z

RTD build has passed; not sure why it's not reflecting back in the PR check. Overriding for now.

Joseph Zhang added 2 commits November 20, 2024 16:15

Emit warning when cpu cores are requested with sharded model deployment.

1bddf72

Reformat sharded model validations.

ffd3178

cj-zhang requested a review from a team as a code owner November 21, 2024 00:20

cj-zhang requested a review from nargokul November 21, 2024 00:20

cj-zhang had a problem deploying to manual-approval November 21, 2024 00:21 — with GitHub Actions Error

gwang111 reviewed Nov 21, 2024

src/sagemaker/model.py (resolved)

gwang111 previously approved these changes Nov 21, 2024

fix pop on none error in jumpstart draft model flow

180e4d2

gwang111 dismissed their stale review via 180e4d2 November 21, 2024 00:49

gwang111 had a problem deploying to manual-approval November 21, 2024 00:49 — with GitHub Actions Error

set lmi config on js model optimize

8b96155

gwang111 had a problem deploying to manual-approval November 21, 2024 22:49 — with GitHub Actions Error

cj-zhang changed the title ~~Emit warning when cpu cores are requested w/sharded model deployment.~~ Fix: ModelBuilder deployment & optimization of JumpStart llama-3.1 models Nov 22, 2024

re-format lmi config switch

33dcf96

gwang111 temporarily deployed to manual-approval November 22, 2024 01:05 — with GitHub Actions Inactive

Lokiiiiii suggested changes Nov 22, 2024

add e2e UT for lmi + .optimize()

37af43a

gwang111 temporarily deployed to auto-approve November 22, 2024 07:20 — with GitHub Actions Inactive

add e2e UT for lmi + .optimize() no override

c5eac9a

gwang111 temporarily deployed to auto-approve November 22, 2024 07:47 — with GitHub Actions Inactive

add deep UTs to catch regressions and test E2E fully and more practically

7860704

gwang111 temporarily deployed to auto-approve November 22, 2024 09:34 — with GitHub Actions Inactive

work around flake8 bug

bec70a0

gwang111 temporarily deployed to auto-approve November 22, 2024 10:00 — with GitHub Actions Inactive

flake8 workaround

a52144e

gwang111 temporarily deployed to auto-approve November 22, 2024 10:07 — with GitHub Actions Inactive

fix flake8 syntax error in py38

a467016

gwang111 temporarily deployed to auto-approve November 22, 2024 10:30 — with GitHub Actions Inactive

mufaddal-rohawala approved these changes Nov 22, 2024

mufaddal-rohawala merged commit 801db44 into aws:master Nov 22, 2024
13 of 14 checks passed

4 participants
